Implement partition compaction grouper #6172

alexqyle · 2024-08-20T00:46:37Z

What this PR does:

This PR implements the partition compaction grouper.

Introduced new files for partition compaction:

partitioned_group_info: This file acts as a compaction plan. It records how source blocks in a compaction time range are assigned to partitions for compaction. The partitionedGroupID in the file is unique for a particular time range.
partition_visit_marker: Visit marker file for each partition under compaction. It prevents multiple compactors from working on the same partition compaction, similar to the block visit marker.
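
To make the relationship between the two files concrete, here is a minimal sketch of how a plan and its visit markers could interact. Every type, field, and function name below (`Partition`, `PartitionedGroupInfo`, `PartitionVisitMarker`, `ClaimablePartitions`) is a hypothetical simplification for illustration and is not taken from the PR's actual schema; the expiry rule based on a timeout is likewise an assumption.

```go
package main

import "time"

// Partition lists the source blocks assigned to one partition.
// All names in this sketch are illustrative assumptions.
type Partition struct {
	PartitionID int
	Blocks      []string // block IDs
}

// PartitionedGroupInfo is a simplified stand-in for the compaction plan
// of one time range.
type PartitionedGroupInfo struct {
	PartitionedGroupID uint32
	RangeStart         int64
	RangeEnd           int64
	Partitions         []Partition
}

// PartitionVisitMarker records which compactor last claimed a partition.
type PartitionVisitMarker struct {
	PartitionedGroupID uint32
	PartitionID        int
	CompactorID        string
	VisitTime          time.Time
}

// IsClaimedByOther reports whether a compactor other than self holds a
// claim that has not yet expired (assumed rule: VisitTime + timeout).
func (m PartitionVisitMarker) IsClaimedByOther(self string, now time.Time, timeout time.Duration) bool {
	if m.CompactorID == self {
		return false
	}
	return now.Before(m.VisitTime.Add(timeout))
}

// ClaimablePartitions returns the IDs of the partitions in the plan that
// self may work on: partitions without a marker, or whose marker is
// either its own or expired.
func ClaimablePartitions(info PartitionedGroupInfo, markers map[int]PartitionVisitMarker, self string, now time.Time, timeout time.Duration) []int {
	var ids []int
	for _, p := range info.Partitions {
		m, ok := markers[p.PartitionID]
		if ok && m.IsClaimedByOther(self, now, timeout) {
			continue
		}
		ids = append(ids, p.PartitionID)
	}
	return ids
}
```

In this sketch a compactor would call `ClaimablePartitions` with the markers it read from object storage and skip any partition another compactor is still working on.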

Here is the high-level algorithm of the partition compaction grouper:

1. Group blocks by time range.
2. Load existing partitioned_group_info files.
3. Gather information about each time range and determine which time ranges the grouper can take compaction jobs from.
4. Create partitioned groups from the grouped blocks.
5. Sanitize the partitions of each partitioned group.
6. Return the partitioned groups that are ready to compact to Thanos for compaction.
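
The first step of the algorithm above can be sketched as follows. This is only an illustration of bucketing blocks into aligned time windows, not the grouper's actual code: `BlockMeta`, `TimeRange`, `GroupByTimeRange`, and `SortedRanges` are hypothetical names, timestamps are assumed to be non-negative milliseconds, and a block that spans more than one window is simply skipped here.

```go
package main

import "sort"

// BlockMeta is a hypothetical, minimal view of a block's metadata.
type BlockMeta struct {
	ID      string
	MinTime int64 // inclusive, milliseconds
	MaxTime int64 // exclusive, milliseconds
}

// TimeRange is an aligned window [Start, End).
type TimeRange struct {
	Start, End int64
}

// GroupByTimeRange buckets blocks into windows of rangeMillis aligned to
// multiples of rangeMillis. Blocks that do not fit inside a single window
// are left out. Timestamps are assumed to be non-negative.
func GroupByTimeRange(blocks []BlockMeta, rangeMillis int64) map[TimeRange][]BlockMeta {
	groups := make(map[TimeRange][]BlockMeta)
	for _, b := range blocks {
		start := rangeMillis * (b.MinTime / rangeMillis)
		end := start + rangeMillis
		if b.MaxTime > end {
			continue // block crosses a window boundary
		}
		key := TimeRange{Start: start, End: end}
		groups[key] = append(groups[key], b)
	}
	return groups
}

// SortedRanges returns the windows in ascending start order so that later
// steps can walk them deterministically.
func SortedRanges(groups map[TimeRange][]BlockMeta) []TimeRange {
	ranges := make([]TimeRange, 0, len(groups))
	for r := range groups {
		ranges = append(ranges, r)
	}
	sort.Slice(ranges, func(i, j int) bool { return ranges[i].Start < ranges[j].Start })
	return ranges
}
```

Each resulting group would then be matched against any existing partitioned_group_info for the same window before new partitioned groups are created.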

Introduced meta_extensions to save the partition information of the result block in meta.json. This information can be used to better assign the block to the proper partition in the next round of compaction.
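
As a rough illustration of reading partition information back from meta.json, the sketch below decodes an extensions payload and falls back to partition 0 when none is present, which matches the migration behaviour discussed later in this conversation. The JSON key names, the `PartitionInfo` fields, and the fallback partition count of 1 are assumptions made for the example, not the PR's confirmed format.

```go
package main

import "encoding/json"

// PartitionInfo is a hypothetical shape for the partition data stored in
// the block's meta.json extensions; the JSON keys are assumptions.
type PartitionInfo struct {
	PartitionedGroupID uint32 `json:"partitioned_group_id"`
	PartitionCount     int    `json:"partition_count"`
	PartitionID        int    `json:"partition_id"`
}

type metaExtensions struct {
	PartitionInfo *PartitionInfo `json:"partition_info,omitempty"`
}

// blockMeta decodes only the parts of meta.json this sketch needs.
type blockMeta struct {
	ULID   string `json:"ulid"`
	Thanos struct {
		Extensions json.RawMessage `json:"extensions,omitempty"`
	} `json:"thanos"`
}

// PartitionInfoFromMeta returns the partition info recorded in meta.json.
// Blocks without partition data are treated as partition 0 of a single
// partition (the count of 1 is an assumption of this sketch).
func PartitionInfoFromMeta(raw []byte) (PartitionInfo, error) {
	fallback := PartitionInfo{PartitionCount: 1, PartitionID: 0}
	var m blockMeta
	if err := json.Unmarshal(raw, &m); err != nil {
		return fallback, err
	}
	if len(m.Thanos.Extensions) == 0 {
		return fallback, nil
	}
	var ext metaExtensions
	if err := json.Unmarshal(m.Thanos.Extensions, &ext); err != nil {
		return fallback, err
	}
	if ext.PartitionInfo == nil {
		return fallback, nil
	}
	return *ext.PartitionInfo, nil
}
```

With a fallback like this, blocks produced before partitioning was enabled can be fed to the grouper without any special handling.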

Which issue(s) this PR fixes:
NA

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Alex Le <[email protected]>

danielblando · 2024-12-20T01:36:48Z

Overall looking good to me. Just a few comments.

For users migrating: would it just work to change the configuration and deploy? I would imagine yes, since we would not find partition data on any block and would treat them all as partitionId 0. Correct?

alexqyle · 2024-12-20T01:44:50Z

> Overall looking good to me. Just a few comments.
>
> For customer to migrate. Would it just work changing the configuration and deploying? I would imagine yes as we would not find partition data to any block and treat them all as partitionId 0. Correct?

Yes. Switching back and forth between partitioning and non-partitioning compaction should not cause any issues. At most, the largest time range blocks would be recompacted one more time.

danielblando · 2024-12-20T01:53:44Z

How does it work while a deployment is in progress? We can have some compactors creating blocks with partitions and others creating blocks without, and they see different visit markers. Would that cause duplicate compaction while the deployment is happening?

alexqyle · 2024-12-20T04:56:13Z

> How it works while deployment is happening? Because we can have compactors creating blocks with partition and compactors creating others without and they are seeing different visit markers? Would it create duplicate compaction while deployment is happening?

If both kinds of compactor are compacting the largest time range blocks, duplicate blocks would be created. Any lower-level blocks would be compacted into the higher level properly after the deployment.

Implement partition compaction grouper

18d8cbc

Signed-off-by: Alex Le <[email protected]>

pull-request-size bot added the size/XXL label Aug 20, 2024

yeya24 reviewed Aug 20, 2024

docs/configuration/config-file-reference.md (outdated, resolved)

alexqyle added 3 commits August 19, 2024 17:55

fix comment

04b50a3

Signed-off-by: Alex Le <[email protected]>

replace level 1 compaction limits with ingestion replication factor

e408173

Signed-off-by: Alex Le <[email protected]>

fix doc

eb09a54

Signed-off-by: Alex Le <[email protected]>

danielblando reviewed Sep 30, 2024

pkg/compactor/compactor.go (resolved)

danielblando reviewed Sep 30, 2024

docs/configuration/config-file-reference.md (outdated, resolved)

alexqyle added 2 commits October 2, 2024 16:23

update compaction_visit_marker_timeout default value

8f34239

Signed-off-by: Alex Le <[email protected]>

Merge branch 'master' into partition-compaction-grouper

567cabe

Signed-off-by: Alex Le <[email protected]>

danielblando reviewed Dec 20, 2024

pkg/compactor/partition_compaction_grouper.go (resolved)

pkg/util/validation/limits.go (outdated, resolved)

update default value for compactor_partition_index_size_limit_in_bytes

baf2969

Signed-off-by: Alex Le <[email protected]>
